Abstract. Motivated by giving a meaning to "The probability that a random integer has initial digit d", we define a URI-set as a random set $E$ of natural integers such that each $n \ge 1$ belongs to $E$ with probability $1/n$, independently of other integers. This enables us to introduce two notions of densities on natural numbers: The URI-density, obtained by averaging along the elements of $E$, and the local URI-density, which we get by considering the $k$-th element of $E$ and letting $k$ go to $\infty$. We prove that the elements of $E$ satisfy Benford's law, both in the sense of URI-density and in the sense of local URI-density. Moreover, if $b_1$ and $b_2$ are two multiplicatively independent integers, then the mantissae of a natural number in base $b_1$ and in base $b_2$ are independent. Connections of URI-density and local URI-density with other well-known notions of densities are established: Both are stronger than the natural density, and URI-density is equivalent to log-density. We also give a stochastic interpretation, in terms of URI-set, of the $H^\infty$-density.



1. Introduction 

1.1. Benford's law and Flehinger's theorem. Benford's law describes the empirical distribution of the leading digit of everyday-life numbers. It was first discovered by the astronomer Simon Newcomb in 1881 [11] and named after the physicist Frank Benford who independently rediscovered the phenomenon in 1938 [1]. According to this law, the proportion of numbers in large series of empirical data with leading digit $d \in \{1, 2, \dots, 9\}$ is $\log_{10}(1 + 1/d)$. More generally, defining the mantissa $\mathcal{M}(x)$ of a positive real number $x$ as the only real number in $[1, 10[$ such that $x = \mathcal{M}(x)\,10^k$ for some integer $k$, Benford's law states that for any $1 \le \alpha < \beta < 10$, the proportion of numbers whose mantissa lies in $[\alpha, \beta]$ is $\log_{10}\beta - \log_{10}\alpha$.

Giving Benford's law a mathematical meaning requires formalizing the notion of
"everyday-life numbers", which is far from obvious. However, there have been many
attempts to explain mathematically the ubiquity of this distribution in empirical 
datasets. One of them is Betty J. Flehinger's theorem, published in 1965 in an 
article entitled On the probability that a random integer has initial digit A [6]. It 
occurred to Flehinger that the most natural set of numbers on which we should 
verify Benford's distribution is the whole set of positive integers. Unfortunately, 
defining
\[ P_n^1(d) := \frac{1}{n}\,\bigl|\{1 \le j \le n : \text{the leading digit of } j \text{ is } d\}\bigr| \]
(that is the proportion of integers between 1 and $n$ with leading digit $d$), we see that the sequence $(P_n^1(d))$ has no limit as $n \to \infty$: It oscillates over longer and longer periods. Flehinger's idea was then to seek the limit by iteration of the process of Cesàro averaging: She inductively set
\[ P_n^{k+1}(d) := \frac{1}{n}\sum_{j=1}^{n} P_j^k(d) \]
and proved that
\[ \lim_{k\to\infty}\limsup_{n\to\infty} P_n^k(d) = \lim_{k\to\infty}\liminf_{n\to\infty} P_n^k(d) = \log_{10}\Bigl(1 + \frac{1}{d}\Bigr), \tag{1} \]

which is the proportion predicted by Benford's law. (Donald Knuth generalized 
Flehinger's theorem to the distribution of the whole mantissa in 1981 [9].) 
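
The oscillation of $P_n^1(d)$ and its damping under iterated Cesàro averaging can be observed numerically. The following Python lines are only an illustrative sketch and not part of Flehinger's argument; the function names, the cut-off `n_max` and the number of iterations are arbitrary choices made here.

```python
from math import log10


def leading_digit(j: int) -> int:
    """Leading decimal digit of a positive integer."""
    return int(str(j)[0])


def iterated_cesaro(d: int, n_max: int, depth: int) -> list:
    """Return levels with levels[k][n - 1] = P_n^k(d) for 0 <= k <= depth."""
    level = [1.0 if leading_digit(j) == d else 0.0 for j in range(1, n_max + 1)]
    levels = [level]
    for _ in range(depth):
        running, averaged = 0.0, []
        for n, value in enumerate(level, start=1):
            running += value
            averaged.append(running / n)  # Cesaro mean of the previous level
        level = averaged
        levels.append(level)
    return levels


def oscillation(seq: list, start: int) -> float:
    """Gap between the largest and smallest value of seq from index start on."""
    tail = seq[start:]
    return max(tail) - min(tail)


levels = iterated_cesaro(d=1, n_max=200_000, depth=3)
for k in (1, 2, 3):
    print("k =", k, "oscillation for n > 1000:", round(oscillation(levels[k], 1000), 4))
print("Benford value log10(2):", round(log10(2), 4))
```

In this sketch the printed oscillation shrinks as $k$ grows, in line with (1), while the first level keeps swinging between roughly $1/9$ and $5/9$.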

In spite of its title, Flehinger's article has no probabilistic content. A good 
reason is that there is no way of picking an integer uniformly at random in the set 
of all natural numbers. The first motivation for the present work was nevertheless 
to translate Flehinger's theorem in the context of probability theory: How can we 
interpret the probability that a (random) integer has a given initial digit? Our 
purpose is thus to give a meaning to the sentence "An integer picked uniformly at 
random has such a property" . 

1.2. Roadmap of the paper. We construct in Section 2 a random infinite set $E$ of integers, which we call a URI-set, such that averaging along the elements of $E$ reflects the expected behaviour of a random integer. This random set enables us to introduce two notions of densities on natural numbers: The URI-density (see Section 3.1), obtained by averaging along the elements of $E$, and the local URI-density (see Section 5.2), which we get by considering the $k$-th element of $E$ and letting $k$ go to $\infty$. We prove that the elements of $E$ satisfy Benford's law, both in the sense of URI-density (Theorem 3.5), and in the sense of local URI-density (Theorem 5.1). Our point of view also enables us to consider simultaneously the mantissae of a number in different bases, and in particular we prove a result which can be interpreted as follows: If $b_1$ and $b_2$ are two multiplicatively independent integers, then the mantissae of a natural number in base $b_1$ and in base $b_2$ are independent (Theorem 4.1). Connections of URI-density and local URI-density with other well-known notions of densities are established: We prove that both are stronger than the natural density (Theorems 3.3 and 5.2), and that in fact URI-density is equivalent to log-density (Theorem 3.6). We finish in Section 6 by giving a stochastic interpretation, in terms of URI-set, of the $H^\infty$-density used in Flehinger's theorem, and by raising some open problems.

The construction of the random set of integers is inspired by a previous article 
by the same authors [8], where a probabilistic proof of Flehinger's theorem was 
provided. We summarize this proof in Section 2.1.

Acknowledgements. The authors are grateful to Bodo Volkmann for stimulating 
questions. 

2. Uniform random set of integers 

2.1. Flehinger's theorem through Markov chain. We introduce a homogeneous Markov chain $(M_k)_{k\ge0}$ taking values in $[1, 10[$, defined by its initial value $M_0$ (which can be deterministic or random) and the following transition probability: for any Borel set $S \subset [1, 10[$,
\[ P(M_{k+1} \in S \mid M_k = a) := P(\mathcal{M}(aU) \in S), \tag{2} \]
where $U$ is a uniform random variable in $[0, 1]$.

Let us denote by $\mu_B$ the probability distribution on $[1, 10[$ given by Benford's law: For any $1 \le t < 10$,
\[ \mu_B([1, t]) := \log_{10} t. \]
It is proved in [8] by a standard coupling argument that

• $\mu_B$ is the only probability distribution on $[1, 10[$ which is invariant under the probability transition (2);

• Whatever choice we make for the initial condition $M_0$, we have for any Borel set $S \subset [1, 10[$ and for all $k \ge 1$
\[ |P(M_k \in S) - \mu_B(S)| \le \Bigl(\frac{9}{10}\Bigr)^k. \tag{3} \]
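
The transition (2) is easy to simulate. The sketch below is only a Monte Carlo illustration under assumptions made here (floating-point arithmetic, the helper names `mantissa`, `step` and `empirical_law`, and the sample sizes); it estimates $P(M_k \in [1, 2[)$ starting from $M_0 = 1$ and compares it with $\mu_B([1, 2[) = \log_{10} 2$.

```python
import random
from math import floor, log10


def mantissa(x: float) -> float:
    """Mantissa in [1, 10) of a positive real number x."""
    m = x / 10.0 ** floor(log10(x))
    # Guard against floating-point rounding at decade boundaries.
    if m >= 10.0:
        m /= 10.0
    if m < 1.0:
        m *= 10.0
    return m


def step(m: float, rng: random.Random) -> float:
    """One transition of the chain: M_{k+1} = mantissa(M_k * U)."""
    u = 1.0 - rng.random()  # uniform in (0, 1], so the product stays positive
    return mantissa(m * u)


def empirical_law(m0: float, k: int, trials: int, rng: random.Random) -> float:
    """Monte Carlo estimate of P(M_k in [1, 2)) given M_0 = m0."""
    hits = 0
    for _ in range(trials):
        m = m0
        for _ in range(k):
            m = step(m, rng)
        hits += m < 2.0
    return hits / trials


rng = random.Random(2024)
for k in (1, 5, 20):
    estimate = empirical_law(1.0, k, 20_000, rng)
    print(k, round(estimate, 3), "bound (9/10)^k =", round(0.9 ** k, 3))
print("log10(2) =", round(log10(2), 3))
```

For $k = 1$ and $M_0 = 1$ the exact value is $\sum_{j\ge1} 10^{-j} = 1/9$, which the first printed estimate should approximate; the later estimates approach $\log_{10} 2$ well within the bound (3).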

A connection is made between the quantities $P_n^k(d)$, $n \ge 1$, and the $k$-th step of our Markov chain: It is established in [8] that for all $a \in [1, 10[$ and all $k \ge 1$,
\[ \lim_{j\to\infty} P^k_{\lfloor a 10^j \rfloor}(d) = P(M_k \in [d, d+1[ \;\mid\; M_0 = a). \]



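A proof of (1), with an estimation of the speed of convergence, follows: We get for all $k \ge 1$
\[ \Bigl|\liminf_{n\to\infty} P_n^k(d) - \mu_B([d, d+1[)\Bigr| \le \Bigl(\frac{9}{10}\Bigr)^k \tag{4} \]
and
\[ \Bigl|\limsup_{n\to\infty} P_n^k(d) - \mu_B([d, d+1[)\Bigr| \le \Bigl(\frac{9}{10}\Bigr)^k. \]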



2.2. Construction of a random set of integers. We can interpret the Markov chain $(M_k)$ as the sequence of mantissae of positive random variables $X_k$, where the sequence $(X_k)$ is itself a Markov chain such that, given $X_0, \dots, X_k$, $X_{k+1}$ is uniformly distributed in $]0, X_k[$. Then $X := \{X_k, k \ge 0\}$ is a discrete random set of real numbers which satisfies the following property: For any $t > 0$, conditionally to the fact that $X \cap\, ]t, \infty[\; \neq \emptyset$, $\max(X \cap\, ]0, t])$ is uniformly distributed in $]0, t]$, and independent of $X \cap\, ]t, \infty[$.

Our idea is thus to imitate the structure of this random set of reals, but inside 
the set of natural numbers. We are looking for a random infinite set of integers E 
satisfying 

(U) for all $n \ge 1$, $\max(E \cap \{1, \dots, n\})$ is uniformly distributed in $\{1, \dots, n\}$;
(I) for all $n \ge 1$, $\max(E \cap \{1, \dots, n\})$ is independent of $E \cap \{n+1, n+2, \dots\}$.

For such a random set $E$, we must have by (U), for each $n \ge 1$,
\[ P(n \in E) = P(\max(E \cap \{1, \dots, n\}) = n) = 1/n, \]
and (I) implies that all events $(n \in E)$ are independent.

Conversely, picking elements of $E$ using independent Bernoulli random variables, with $P(n \in E) = 1/n$ for each $n \ge 1$, gives a random set satisfying the required conditions. Indeed, for each $j \in \{1, \dots, n\}$, we get
\[ P(\max(E \cap \{1, \dots, n\}) = j) = P(j \in E, j+1 \notin E, \dots, n \notin E) = \frac{1}{j}\cdot\frac{j}{j+1}\cdots\frac{n-1}{n} = \frac{1}{n}. \]
Observe also that, with probability 1, the cardinality of $E$ is infinite.

Because of the uniformity property (U), such a random set $E$ appears as a good way to model the uniform distribution in the set of natural numbers and will therefore be referred to as a set of uniform random integers, or URI-set.
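
The telescoping product behind property (U) can be checked in exact arithmetic. The short Python sketch below is only an illustration; the helper names `prob_max_equals` and `sample_uri_set` and the value `n = 12` are choices made here, not notation from the text.

```python
import random
from fractions import Fraction


def prob_max_equals(j: int, n: int) -> Fraction:
    """Exact P(max(E ∩ {1..n}) = j) when P(i in E) = 1/i independently."""
    p = Fraction(1, j)                  # j belongs to E
    for i in range(j + 1, n + 1):
        p *= 1 - Fraction(1, i)         # j+1, ..., n do not belong to E
    return p


def sample_uri_set(n: int, rng: random.Random) -> list:
    """Sample E ∩ {1..n} with independent Bernoulli(1/i) draws."""
    return [i for i in range(1, n + 1) if rng.random() < 1.0 / i]


n = 12
print([prob_max_equals(j, n) for j in range(1, n + 1)])  # every entry equals 1/12
print(sample_uri_set(50, random.Random(0)))
```

Since $P(1 \in E) = 1$, every sampled set contains 1, which is why $N_1 = 1$ in the next section.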

3. URI-density and Benford's law

From now on, $E$ denotes a URI-set, and we denote its ordered elements by
\[ E = \{N_1 = 1 < N_2 < \dots < N_k < \dots\}. \]
For each $n \ge 1$, we set $E_n := E \cap \{1, \dots, n\}$.

It will be useful to give the following estimation of $|E_n|$.

Lemma 3.1.
\[ \frac{|E_n|}{\ln n} \xrightarrow[n\to\infty]{\text{a.s.}} 1. \]

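We recall Theorem 12 page 272 in Petrov's book [12]:

Theorem 3.2. Let $(Z_n)$ be a sequence of independent centered real-valued random variables, and set $S_n := Z_1 + \dots + Z_n$. If $a_n \uparrow \infty$ and
\[ \sum_{n\ge1} \frac{E|Z_n|^p}{a_n^p} < \infty \]
for some $p$, $1 \le p \le 2$, then
\[ \frac{S_n}{a_n} \xrightarrow[n\to\infty]{\text{a.s.}} 0. \]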



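Proof of Lemma 3.1. We consider the independent centered random variables $Z_n := 1_E(n) - 1/n$. Since
\[ \sum_{n\ge2} \frac{E[Z_n^2]}{(\ln n)^2} = \sum_{n\ge2} \frac{1}{n}\Bigl(1 - \frac{1}{n}\Bigr)\frac{1}{(\ln n)^2} \le \sum_{n\ge2} \frac{1}{n(\ln n)^2} < \infty, \]
we get by Theorem 3.2
\[ \frac{S_n}{\ln n} = \frac{1}{\ln n}\sum_{j=1}^{n} 1_E(j) - \frac{1}{\ln n}\sum_{j=1}^{n} \frac{1}{j} \xrightarrow[n\to\infty]{\text{a.s.}} 0. \tag{5} \]
Since $\sum_{j=1}^{n} 1_E(j) = |E_n|$ and $\sum_{j=1}^{n} 1/j \sim \ln n$, this concludes the proof of the lemma. □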

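3.1. URI-density. We say that a subset $A$ of the set of natural numbers has URI-density $\alpha$ if
\[ \frac{1}{n}\sum_{k=1}^{n} 1_A(N_k) \xrightarrow[n\to\infty]{\text{a.s.}} \alpha. \]
Note that an equivalent formulation is
\[ \frac{1}{|E_n|}\sum_{j\in E_n} 1_A(j) \xrightarrow[n\to\infty]{\text{a.s.}} \alpha. \tag{6} \]
As we expect, the URI-density generalizes the notion of natural density.

Theorem 3.3. Let $A \subset \mathbb{Z}_+$. If $\frac{1}{n}\sum_{j=1}^{n} 1_A(j) \to \alpha$ as $n \to \infty$, then $A$ has URI-density $\alpha$.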

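Proof. When considering the elements of $E_n$, it will be convenient to order them backwards:
\[ E_n = E \cap \{1, \dots, n\} = \{Y_1^{(n)} > Y_2^{(n)} > \dots > Y_{|E_n|}^{(n)} = 1\}. \]
For each $n \ge 1$, let $a_n := 1_A(n)$. We are going to prove the result in the form
\[ \frac{1}{|E_n|}\sum_{i=1}^{|E_n|} a_{Y_i^{(n)}} \xrightarrow[n\to\infty]{\text{a.s.}} \alpha. \]
We split the above average as
\[ \frac{1}{|E_n|}\sum_{i=1}^{|E_n|} \Bigl(a_{Y_i^{(n)}} - K_i^{(n)}\Bigr) + \frac{1}{|E_n|}\sum_{i=1}^{|E_n|} K_i^{(n)}, \tag{7} \]
where
\[ K_i^{(n)} := \begin{cases} E\bigl[a_{Y_i^{(n)}} \mid Y_1^{(n)}, \dots, Y_{i-1}^{(n)}\bigr] & (2 \le i \le |E_n|), \\ E\bigl[a_{Y_1^{(n)}}\bigr] & (i = 1). \end{cases} \]
We first deal with the second term of (7). By Properties (U) and (I), $Y_1^{(n)}$ is uniformly distributed in $\{1, \dots, n\}$, and conditionally to $(Y_1^{(n)}, \dots, Y_{i-1}^{(n)})$, $Y_i^{(n)}$ is uniformly distributed in $\{1, \dots, Y_{i-1}^{(n)} - 1\}$, as long as $Y_{i-1}^{(n)} > 1$. Hence,
\[ K_1^{(n)} = \frac{1}{n}\sum_{j=1}^{n} a_j \quad\text{and}\quad K_i^{(n)} = \frac{1}{Y_{i-1}^{(n)} - 1}\sum_{j=1}^{Y_{i-1}^{(n)} - 1} a_j \quad (2 \le i \le |E_n|). \]
By hypothesis, as soon as $Y_{i-1}^{(n)}$ is large, $K_i^{(n)}$ is close to $\alpha$. For any fixed $\varepsilon > 0$, the number of $i$'s such that $|K_i^{(n)} - \alpha| > \varepsilon$ is bounded independently of $n$. Since $|E_n| \to \infty$ a.s., it follows that
\[ \frac{1}{|E_n|}\sum_{i=1}^{|E_n|} K_i^{(n)} \xrightarrow[n\to\infty]{\text{a.s.}} \alpha. \]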

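Let us now turn to the first term of (7). Define for any $n \ge 1$ and $i \ge 1$
\[ \Delta_i^{(n)} := \begin{cases} a_{Y_i^{(n)}} - K_i^{(n)} & (i \le |E_n|), \\ 0 & (i > |E_n|). \end{cases} \]
In order to prove (6), it remains to show that
\[ \frac{1}{|E_n|}\sum_{i=1}^{|E_n|} \Delta_i^{(n)} \xrightarrow[n\to\infty]{\text{a.s.}} 0. \tag{8} \]
Using a standard method, we first prove that (8) holds along a subsequence, then we control the oscillations to conclude. Consider $n_m := \lfloor \exp(m^2) \rfloor$. By Lemma 3.1, convergence along the subsequence $(n_m)$ amounts to
\[ \Delta(n_m) := \frac{1}{\lfloor \ln n_m \rfloor}\sum_{i=1}^{\lfloor \ln n_m \rfloor} \Delta_i^{(n_m)} \xrightarrow[m\to\infty]{\text{a.s.}} 0. \]
Observe now that the variance of $\Delta_i^{(n)}$ is bounded, and that for $i \neq j$, $E[\Delta_i^{(n)}\Delta_j^{(n)}] = 0$. Therefore, the variance of $\Delta(n_m)$ is of order $(\ln n_m)^{-1} = 1/m^2$. By Tchebychev's inequality,
\[ \sum_{m\ge1} P\bigl(|\Delta(n_m)| > m^{-1/4}\bigr) \le \sum_{m\ge1} m^{1/2}\, O\Bigl(\frac{1}{m^2}\Bigr) < \infty. \]
Hence, by Borel-Cantelli, with probability 1 we have $|\Delta(n_m)| \le m^{-1/4}$ for $m$ large enough. This proves that
\[ \frac{1}{|E_{n_m}|}\sum_{i=1}^{|E_{n_m}|} \Delta_i^{(n_m)} \xrightarrow[m\to\infty]{\text{a.s.}} 0. \]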

Consider now an integer $n \in\, ]n_m, n_{m+1}[$. We can write



Since A"" is bounded and |-En„| — |-E'n| = o{\En\), the second term on the RHS 
vanishes as m oo. Moreover, En„^ C En- Therefore each A"" in the first term 
(except A"" which has a slightly different definition) is annihilated by some 
and the first term of (9) reduces to 



^ |£„|-|£„„| + 1 ^ 



E\n _ ^ /in™ 



which goes to zero as $m \to \infty$. Since we already know that the third term goes to zero, this proves (8). □

3.2. Benford's law. We say that a sequence of positive real numbers $(x_n)$ follows Benford's law if for all $1 \le t < 10$
\[ \frac{1}{n}\sum_{k=1}^{n} 1_{\mathcal{M}(x_k) \le t} \xrightarrow[n\to\infty]{} \log_{10} t. \]
Remark 3.4. We recall that this is equivalent to the uniform distribution mod 1 of the sequence $(\log_{10} x_k)$ (see e.g. [3]). Therefore, this is also equivalent to the fact that the sequence $(1/x_k)$ follows Benford's law.

The following theorem shows that the elements of the URI-set $E$ almost surely follow Benford's law.

Theorem 3.5. For any $1 \le t < 10$, the URI-density of $\{n \ge 1 : \mathcal{M}(n) \le t\}$ is $\log_{10} t$.
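
Theorem 3.5 can be illustrated by simulation, with one caveat: since $|E_n| \sim \ln n$, testing every integer up to $n$ yields very few elements. The sketch below therefore jumps directly from one element to the next, using the identity $P(N_{k+1} > j \mid N_k = m) = \prod_{i=m+1}^{j}(1 - 1/i) = m/j$, so that $N_{k+1}$ has the law of $\lceil m/U \rceil$ with $U$ uniform in $]0, 1[$. This shortcut, the discretisation of $U$ on a grid of size `PRECISION`, and all names below are assumptions of the illustration, not part of the proof.

```python
import random
from math import log10

PRECISION = 1 << 64  # grid size for the discretised uniform variable (assumption)


def next_element(current: int, rng: random.Random) -> int:
    """Approximate draw of N_{k+1} given N_k = current, as ceil(current / U)."""
    r = rng.randrange(1, PRECISION)        # U = r / PRECISION lies in (0, 1)
    return -(-current * PRECISION // r)    # exact integer ceiling of current / U


def uri_elements(count: int, rng: random.Random) -> list:
    """First `count` elements N_1 = 1 < N_2 < ... of an (approximate) URI-set."""
    elements = [1]
    while len(elements) < count:
        elements.append(next_element(elements[-1], rng))
    return elements


def leading_digit_frequencies(numbers: list) -> list:
    """Entry d is the fraction of numbers whose leading decimal digit is d."""
    counts = [0] * 10
    for x in numbers:
        counts[int(str(x)[0])] += 1
    return [c / len(numbers) for c in counts]


elements = uri_elements(3000, random.Random(7))
freqs = leading_digit_frequencies(elements)
for d in range(1, 10):
    print(d, round(freqs[d], 3), "Benford:", round(log10(1 + 1 / d), 3))
```

The elements grow roughly like $e^k$, so Python's arbitrary-precision integers are used throughout; the count 3000 keeps the numbers below the default limit for integer-to-string conversion.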
Proof. By Duncan's work [5], this result can be viewed as a corollary of the equivalence between URI-density and log-density (see Theorem 3.6 below). However we provide a direct proof of it, in which useful ideas will be presented.

It is convenient to consider a coupling of the URI-set $E$ and its continuous analog defined as follows: Let $\xi$ be a Poisson process on $\mathbb{R}_+^*$ with intensity $1/x$. It can be viewed as a random set of points. For any interval $I$, let $\xi_I$ denote the number of points in $I \cap \xi$: $\xi_I$ is Poisson distributed with parameter $\int_I dx/x$. From $\xi$, define the random set $E$ as the set of integers $n \ge 1$ such that $\xi_{]n-1,n]} \ge 1$. Since the random variables $(\xi_{]n-1,n]})_{n\ge1}$ are independent and
\[ P(n \in E) = 1 - P(\xi_{]n-1,n]} = 0) = \frac{1}{n}, \]
$E$ is a URI-set.

The process $\xi$ satisfies a property analogous to Property (U): For any $a > 0$, the largest point of $[0, a] \cap \xi$ is uniformly distributed in $[0, a]$. Indeed, for any $0 < t \le a$
\[ P(\max([0, a] \cap \xi) \le t) = P(\xi_{]t,a]} = 0) = \frac{t}{a}. \]
Let us order backwards the points of $\xi \cap [0, 1]$: $1 > Y_1 > Y_2 > \cdots$. Conditionally to $Y_k$, $Y_{k+1}$ is uniformly distributed in $[0, Y_k]$. By Proposition 3.1 of [8], the mantissae $(\mathcal{M}(Y_k))$ constitute a Markov chain whose unique invariant distribution is $\mu_B$. Moreover, $Y_1$ being uniformly distributed in $[0, 1]$, the distribution of $\mathcal{M}(Y_1)$ is the normalized Lebesgue measure on $[1, 10[$, hence is equivalent to $\mu_B$. By the pointwise ergodic theorem, we have for any $1 \le t < 10$
\[ \frac{1}{n}\sum_{k=1}^{n} 1_{\mathcal{M}(Y_k) \le t} \xrightarrow[n\to\infty]{\text{a.s.}} \log_{10} t. \]



Consider now the points of $\xi \cap\, ]1, +\infty[$: $X_1 < X_2 < \cdots$. They follow the same distribution as $(1/Y_1, 1/Y_2, \dots)$. By Remark 3.4, we deduce that
\[ \frac{1}{n}\sum_{k=1}^{n} 1_{\mathcal{M}(X_k) \le t} \xrightarrow[n\to\infty]{\text{a.s.}} \log_{10} t. \tag{10} \]



Now observe that
\[ \sum_{n\ge2} P(\xi_{]n-1,n]} \ge 2) = \sum_{n\ge2}\Bigl(\frac{1}{n} + \frac{n-1}{n}\ln\frac{n-1}{n}\Bigr) < \infty, \]
hence by Borel-Cantelli, with probability one there is only a finite number of $n$'s such that $\xi_{]n-1,n]} \ge 2$. Coming back to the URI-set $E = \{N_1 = 1 < N_2 < \cdots\}$, this implies the almost-sure existence of $R$ such that, for all large enough $k$, $0 \le N_k - X_{k+R} < 1$. Since $N_k \to \infty$ a.s., we have $|\mathcal{M}(N_k) - \mathcal{M}(X_{k+R})| \to 0$ unless $N_k$ is of the form $10^\ell$ (which, again by Borel-Cantelli, happens almost surely for only finitely many $k$'s). It follows from (10) that
\[ \frac{1}{n}\sum_{k=1}^{n} 1_{\mathcal{M}(N_k) \le t} \xrightarrow[n\to\infty]{\text{a.s.}} \log_{10} t. \]
□

□ 






ELISE JANVRESSE AND THIERRY DE LA RUE 



3.3. Equivalence with log-density. To deal with the problem of non-existence of natural densities, several alternative densities have been introduced. Flehinger's theorem amounts to considering the so-called $H^\infty$-density, obtained by iteration of Cesàro averages: A subset $A$ of $\mathbb{Z}_+$ has $H^\infty$-density $\alpha$ if
\[ \lim_{k\to\infty}\limsup_{n\to\infty} P_n^k = \lim_{k\to\infty}\liminf_{n\to\infty} P_n^k = \alpha, \]
where the $P_n^k$'s are inductively defined by $P_n^0 := 1_A(n)$ and $P_n^{k+1} := (1/n)\sum_{j=1}^{n} P_j^k$. Obviously, $H^\infty$-density is stronger than natural density, in the sense used by Diaconis in [2]: If $A$ has natural density $\alpha$, then $A$ has $H^\infty$-density $\alpha$. The example of the set $A$ of integers whose initial digit (when written in base 10) is 1 shows that $H^\infty$-density is strictly stronger than natural density.

Still stronger than $H^\infty$-density is the notion of log-density. Recall that $A \subset \mathbb{Z}_+$ has log-density $\alpha$ if
\[ \frac{1}{\ln n}\sum_{j=1}^{n} \frac{1}{j}\, 1_A(j) \xrightarrow[n\to\infty]{} \alpha. \]
A proof that log-density is stronger than $H^\infty$-density can be found in [3], together with an example of a set $A$ with a log-density, but for which the $H^\infty$-density fails to exist.



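Theorem 3.6. Let $A \subset \mathbb{Z}_+$. Then
\[ \frac{1}{|E_n|}\sum_{j=1}^{n} 1_A(j)\,1_E(j) - \frac{1}{\ln n}\sum_{j=1}^{n} \frac{1_A(j)}{j} \xrightarrow[n\to\infty]{\text{a.s.}} 0. \]
In particular, $A$ has URI-density $\alpha$ if and only if $A$ has log-density $\alpha$.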

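Proof. We apply Theorem 3.2: We consider the independent random variables $Z_n := 1_A(n)\bigl(1_E(n) - 1/n\bigr)$, which are centered. Since
\[ \sum_{n\ge2} \frac{E[Z_n^2]}{(\ln n)^2} \le \sum_{n\ge2} \frac{1}{n}\Bigl(1 - \frac{1}{n}\Bigr)\frac{1}{(\ln n)^2} \le \sum_{n\ge2} \frac{1}{n(\ln n)^2} < \infty, \]
we get
\[ \frac{S_n}{\ln n} = \frac{1}{\ln n}\sum_{j=1}^{n} 1_A(j)\,1_E(j) - \frac{1}{\ln n}\sum_{j=1}^{n} \frac{1_A(j)}{j} \xrightarrow[n\to\infty]{\text{a.s.}} 0. \tag{11} \]
Since, by Lemma 3.1,
\[ \frac{|E_n|}{\ln n} \xrightarrow[n\to\infty]{\text{a.s.}} 1, \]
we can conclude the proof of Theorem 3.6. □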
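
To see concretely why log-density succeeds where natural density fails, one can compare both averages for the set of integers with leading digit 1. The sketch below is only an illustration, with function name and cut-offs chosen here; the logarithmic average approaches $\log_{10} 2$ slowly, with an error of order $1/\ln n$.

```python
from math import log, log10


def natural_and_log_averages(n_max: int, d: int = 1) -> tuple:
    """Return ((1/n) sum 1_A(j), (1/ln n) sum 1_A(j)/j) for n = n_max,
    where A is the set of integers with leading decimal digit d."""
    digit = str(d)
    count, weighted = 0, 0.0
    for j in range(1, n_max + 1):
        if str(j)[0] == digit:
            count += 1
            weighted += 1.0 / j
    return count / n_max, weighted / log(n_max)


for n in (1_000_000, 1_999_999):
    natural, logarithmic = natural_and_log_averages(n)
    print(n, round(natural, 4), round(logarithmic, 4))
print("log10(2) =", round(log10(2), 4))
```

Between the two cut-offs the natural average jumps from about $1/9$ to about $5/9$, whereas the logarithmic average moves only slightly around $\log_{10} 2$.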

4. Independence of mantissae in different bases

All the previous results concerned numbers written in the base-10 numeral system, but extend straightforwardly to any integer base $b$. We denote by $\mathcal{M}_b(x) \in [1, b[$ the mantissa in base $b$ of a positive real number $x$ and by $\mu_b$ the probability distribution over $[1, b[$ defined by $\mu_b([1, t]) := \log_b t$ for all $1 \le t < b$.

The purpose of this section is to prove the following theorem, which states that the mantissae in different bases of the elements of the URI-set $E$ are independent, under some algebraic condition on the bases.

Theorem 4.1. Let $(b_i)_{1\le i\le\ell}$ be positive integers, satisfying
\[ \forall a_1, \dots, a_\ell \in \mathbb{Z}, \quad \frac{a_1}{\ln b_1} + \dots + \frac{a_\ell}{\ln b_\ell} = 0 \implies a_1 = \dots = a_\ell = 0. \tag{12} \]
Then for any $1 \le t_i < b_i$ $(1 \le i \le \ell)$, $\{n \in \mathbb{Z}_+ : \mathcal{M}_{b_i}(n) \le t_i \ \forall 1 \le i \le \ell\}$ has URI-density equal to $\prod_{i=1}^{\ell} \log_{b_i} t_i$.

Recall that the positive integers $(b_i)_{1\le i\le\ell}$ are said to be multiplicatively independent if $b_1^{s_1} \cdots b_\ell^{s_\ell} = 1$ where $(s_i)_{1\le i\le\ell} \subset \mathbb{Z}$ implies that $s_i = 0$ for all $i$. Note that, in the case $\ell = 2$, property (12) exactly means that $b_1$ and $b_2$ are multiplicatively independent. To our knowledge, it is unknown whether, in the general case, multiplicative independence of $b_1, \dots, b_\ell$ implies property (12). This question is related to the so-called Schanuel's conjecture in transcendental number theory (see [10], p. 30-31 or [13]).

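Lemma 4.2. Let $(Z_k)_{k\ge1}$ be i.i.d. random variables taking values in $(\mathbb{R}/\mathbb{Z})^\ell$ with common distribution $\nu$. Assume that the only probability distribution $\mu$ such that $\mu * \nu = \mu$ is the Lebesgue measure on $(\mathbb{R}/\mathbb{Z})^\ell$. Then the random walk $(P_k)_{k\ge1} := (Z_1 + \dots + Z_k)_{k\ge1}$ is uniformly distributed on $(\mathbb{R}/\mathbb{Z})^\ell$. In other words, it satisfies: For all cylinder $C = [u_1, v_1] \times \dots \times [u_\ell, v_\ell]$ where $0 \le u_i < v_i \le 1$,
\[ \frac{1}{n}\sum_{k=1}^{n} 1_C(P_k) \xrightarrow[n\to\infty]{\text{a.s.}} \prod_{i=1}^{\ell}(v_i - u_i). \tag{13} \]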

Proof. Denote by $\mu$ the Lebesgue measure on $(\mathbb{R}/\mathbb{Z})^\ell$. Let $M_0$ be a random variable with law $\mu$, independent of $(Z_k)_{k\ge1}$. Setting
\[ M_k := M_0 + Z_1 + \dots + Z_k = M_0 + P_k, \]
we get a stationary random walk $(M_k)_{k\ge0}$. Since $\mu$ is the unique invariant measure under convolution by $\nu$, the stationary process $(M_k)_{k\ge0}$ is ergodic, and by Birkhoff ergodic theorem, we get that for all cylinder $C = [u_1, v_1] \times \dots \times [u_\ell, v_\ell]$ where $0 \le u_i < v_i \le 1$,
\[ \frac{1}{n}\sum_{k=1}^{n} 1_C(M_k) \xrightarrow[n\to\infty]{\text{a.s.}} \prod_{i=1}^{\ell}(v_i - u_i). \]
(See e.g. [7], Corollary 2.5.2 page 38.) Therefore, we can find some $m_0 \in (\mathbb{R}/\mathbb{Z})^\ell$ such that, with probability 1, for all cylinder $C = [u_1, v_1] \times \dots \times [u_\ell, v_\ell]$ where $0 \le u_i < v_i \le 1$ are rational numbers,
\[ \frac{1}{n}\sum_{k=1}^{n} 1_C(m_0 + P_k) \xrightarrow[n\to\infty]{} \prod_{i=1}^{\ell}(v_i - u_i). \]
We thus obtain that, for all cylinder $C$ with rational endpoints,
\[ \frac{1}{n}\sum_{k=1}^{n} 1_{C - m_0}(P_k) \xrightarrow[n\to\infty]{} \prod_{i=1}^{\ell}(v_i - u_i) = \mu(C - m_0). \]
By density of the rationals, (13) is satisfied for any cylinder $C$. □

Proof of Theorem 4.1. As in the proof of Theorem 3.5, we consider the coupling of the URI-set with the Poisson process $\xi$, and we denote by $\cdots > X_2 > X_1 > 1 > Y_1 > Y_2 > \cdots$ the points of $\xi$. Define, for all $k \ge 1$, $U_k := Y_k / Y_{k-1}$ (where $Y_0 := 1$): Then $(U_k)_{k\ge1}$ is a sequence of i.i.d. uniform random variables in $[0, 1]$, and $Y_k = U_1 U_2 \cdots U_k$.

Set $Z_k := \bigl(\log_{b_1}(U_k) \bmod 1, \dots, \log_{b_\ell}(U_k) \bmod 1\bigr) \in (\mathbb{R}/\mathbb{Z})^\ell$ and let $\nu$ be the common law of the $Z_k$'s. We claim that the only probability measure $\mu$ which is invariant under convolution by $\nu$ is the Lebesgue measure on $(\mathbb{R}/\mathbb{Z})^\ell$. Indeed, for such an invariant measure, the Fourier coefficients must satisfy
\[ \forall (m_1, \dots, m_\ell) \in \mathbb{Z}^\ell, \quad \hat\mu(m_1, \dots, m_\ell) = \hat\mu(m_1, \dots, m_\ell)\,\hat\nu(m_1, \dots, m_\ell). \]
We just have to check that $\hat\nu(m_1, \dots, m_\ell) \neq 1$ when $(m_1, \dots, m_\ell) \neq (0, \dots, 0)$. We have
\[ \hat\nu(m_1, \dots, m_\ell) = \int_{(\mathbb{R}/\mathbb{Z})^\ell} e^{-i2\pi(m_1 t_1 + \dots + m_\ell t_\ell)}\, d\nu(t_1, \dots, t_\ell) = \int_{[0,1]} e^{-i2\pi(m_1 \log_{b_1} u + \dots + m_\ell \log_{b_\ell} u)}\, du = \int_{[0,1]} u^{-i2\pi\theta}\, du, \]
where $\theta := \frac{m_1}{\ln b_1} + \dots + \frac{m_\ell}{\ln b_\ell} \neq 0$ for $(m_1, \dots, m_\ell) \neq (0, \dots, 0)$ by (12). Hence,
\[ \hat\nu(m_1, \dots, m_\ell) = \frac{1}{1 - i2\pi\theta} \neq 1, \]
which proves the claim.

It follows by Lemma 4.2 that the sequence
\[ \bigl(\log_{b_1}(Y_k) \bmod 1, \dots, \log_{b_\ell}(Y_k) \bmod 1\bigr)_{k\ge1} \]
is uniformly distributed in $(\mathbb{R}/\mathbb{Z})^\ell$, and the same is true if we replace $Y_k$ by $X_k$. The end of the proof goes with similar arguments as for Theorem 3.5. □

Cassels-Schmidt-Benford sequences. Given two multiplicatively independent positive integers $b_1$ and $b_2$, Bodo Volkmann defined a Cassels-Schmidt number of type $(b_1, b_2)$ as a number which is normal in base $b_1$ but not in base $b_2$. By analogy, he also proposed to define a Cassels-Schmidt-Benford (CSB) sequence of type $(b_1, b_2)$ as a sequence of positive numbers $(x_k)$ which follows Benford's law with respect to base $b_1$ but not with respect to base $b_2$.

It turns out that it is far easier to find explicit examples of CSB sequences. Indeed, take $x_k := (b_2)^k$; then $(x_k)$ does certainly not follow Benford's law with respect to base $b_2$. But since $\ln b_2 / \ln b_1$ is not a rational number, the sequence $(\log_{b_1} x_k \bmod 1) = (k \ln b_2 / \ln b_1 \bmod 1)$ is uniformly distributed in $[0, 1]$. Hence $(x_k)$ follows Benford's law with respect to base $b_1$.

As an application of Theorem 4.1 in the case $\ell = 2$, we get that in almost every URI-set, we can find a CSB sequence $(x_k)$ of type $(b_1, b_2)$, such that the sequence $(\mathcal{M}_{b_2}(x_k))$ follows any probability distribution prescribed in advance on $[1, b_2[$.
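
The explicit example $x_k = (b_2)^k$ can be checked directly in the case $b_1 = 10$, $b_2 = 2$. The following sketch is only an illustration with names chosen here: it counts leading decimal digits of the powers $2^1, \dots, 2^{1000}$ and verifies that every such power has mantissa 1 in base 2.

```python
from math import log10


def leading_digit_counts(base: int, terms: int) -> list:
    """counts[d] = number of k in 1..terms such that base**k starts with digit d."""
    counts = [0] * 10
    x = 1
    for _ in range(terms):
        x *= base
        counts[int(str(x)[0])] += 1
    return counts


def binary_mantissa_is_one(x: int) -> bool:
    """True when x is a power of two, i.e. its mantissa in base 2 equals 1."""
    return x > 0 and x & (x - 1) == 0


counts = leading_digit_counts(2, 1000)
for d in range(1, 10):
    print(d, counts[d] / 1000, "Benford:", round(log10(1 + 1 / d), 3))
print(all(binary_mantissa_is_one(2 ** k) for k in range(1, 1001)))  # True
```

The decimal frequencies are close to $\log_{10}(1 + 1/d)$, while in base 2 the mantissa is constant, so the sequence is of type $(10, 2)$.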



5. Local URI-density 



For the sake of simplicity, we now return to the classical base 10 numeration system (but obviously the following results also hold in any integer base).

5.1. A single element of a URI-set satisfies Benford's law. According to Theorem 3.6, Theorem 3.5 turns out to be weaker than Flehinger's theorem. However, ideas similar to those developed in its proof can lead to somewhat stronger results than Theorem 3.5.

Theorem 5.1. For all $1 \le \alpha < \beta < 10$
\[ \lim_{k\to\infty} P\bigl(\mathcal{M}(N_k) \in [\alpha, \beta]\bigr) = \mu_B([\alpha, \beta]). \]

Proof. As in the proof of Theorem 3.5, we consider the URI-set $E$ constructed from the Poisson process $\xi$. For a fixed integer $m$, we number the points of $\xi \cap\, ]m, +\infty[$:
\[ \xi \cap\, ]m, +\infty[\; = \{X_1^m < X_2^m < \cdots\}. \]
We also set $X_0^m := m$. Observe that the process $(1/X_k^m)_{k\ge0}$ is again a Markov chain such that, given $(1/X_0^m, \dots, 1/X_k^m)$, $1/X_{k+1}^m$ is uniformly distributed in $]0, 1/X_k^m[$. It follows by (3) and Remark 3.4 that, for any Borel set $S \subset [1, 10[$,
\[ \forall k \ge 1, \quad \bigl|P(\mathcal{M}(X_k^m) \in S) - \mu_B(S)\bigr| \le \Bigl(\frac{9}{10}\Bigr)^k. \tag{14} \]
We now consider the event
\[ A_m := \bigcap_{n>m} \bigl(\xi_{]n-1,n]} \le 1\bigr) \cap \bigcap_{\ell : 10^\ell > m} \bigl(\xi_{]10^\ell - 1, 10^\ell]} = 0\bigr). \]
Defining the random variable $J_m$ as the largest index such that $N_{J_m} \le m$, the realization of $A_m$ ensures that the shift of index between the process $(X_k^m)$ and the integers $N_k$ that are larger than $m + 1$ remains constant, equal to $J_m$. Therefore, for any $k \ge 1$
\[ 0 \le N_{J_m + k} - X_k^m < 1. \]
Moreover, since $A_m$ also forbids that any $N_k > m$ be of the form $10^\ell$, we get (conditionally to $A_m$)
\[ \forall k \ge 1, \quad 0 \le \mathcal{M}(N_{J_m + k}) - \mathcal{M}(X_k^m) < \frac{10}{m}. \tag{15} \]

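Note also that $P(A_m) \to 1$ as $m \to \infty$, so that choosing $m$ large enough will enable us to condition with respect to $A_m$ without affecting too much the probability of any event. Indeed, we will make use of the following inequality, valid for any events $A$ and $B$ with $P(A) > 0$:
\[ |P(B \mid A) - P(B)| \le \frac{1 - P(A)}{P(A)}. \tag{16} \]
Let us fix an arbitrary $\varepsilon > 0$. We choose $m$ large enough so that
\[ \frac{1 - P(A_m)}{P(A_m)} < \varepsilon. \tag{17} \]
Conditioning with respect to $J_m$, which takes values in $\{1, \dots, m\}$, we get
\[ \bigl|P(\mathcal{M}(N_k) \in [\alpha, \beta]) - \mu_B([\alpha, \beta])\bigr| \le \sum_{j=1}^{m} P(J_m = j)\,\bigl|P(\mathcal{M}(N_k) \in [\alpha, \beta] \mid J_m = j) - \mu_B([\alpha, \beta])\bigr|. \]
Then we write, for any $1 \le j \le m$,
\[ \bigl|P(\mathcal{M}(N_k) \in [\alpha, \beta] \mid J_m = j) - \mu_B([\alpha, \beta])\bigr| \le D_1 + D_2 + D_3 + D_4 \]
where
\[ D_1 := \bigl|P(\mathcal{M}(N_k) \in [\alpha, \beta] \mid J_m = j) - P(\mathcal{M}(N_k) \in [\alpha, \beta] \mid A_m, J_m = j)\bigr|, \]
\[ D_2 := \bigl|P(\mathcal{M}(N_{J_m + k - j}) \in [\alpha, \beta] \mid A_m, J_m = j) - P(\mathcal{M}(X^m_{k-j}) \in [\alpha, \beta] \mid A_m, J_m = j)\bigr|, \]
\[ D_3 := \bigl|P(\mathcal{M}(X^m_{k-j}) \in [\alpha, \beta] \mid A_m, J_m = j) - P(\mathcal{M}(X^m_{k-j}) \in [\alpha, \beta] \mid J_m = j)\bigr|, \]
\[ D_4 := \bigl|P(\mathcal{M}(X^m_{k-j}) \in [\alpha, \beta] \mid J_m = j) - \mu_B([\alpha, \beta])\bigr|. \]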



Observe that $A_m$, which is measurable with respect to the Poisson process on $]m, +\infty[$, is independent of $(J_m = j)$, which is measurable with respect to the Poisson process on $]1, m]$. Hence, using (16) and (17), we can bound $D_1 + D_3$ by $2\varepsilon$.

Again, $X^m_{k-j}$ is measurable with respect to the Poisson process on $]m, +\infty[$, hence is independent of $(J_m = j)$. By (14), the contribution of $D_4$ can be bounded by $(9/10)^{k-j}$, hence by $(9/10)^{k-m}$.

It remains to deal with $D_2$. Since everything is conditioned on $A_m$, we can use (15) to get
\[ D_2 \le P\bigl(\mathcal{M}(X^m_{k-j}) \in [\alpha - 10/m, \alpha] \mid A_m, J_m = j\bigr) + P\bigl(\mathcal{M}(X^m_{k-j}) \in [\beta - 10/m, \beta] \mid A_m, J_m = j\bigr). \]
Using again (16), (17) and (14) yields
\[ D_2 \le 2(9/10)^{k-m} + 2\varepsilon + \mu_B([\alpha - 10/m, \alpha]) + \mu_B([\beta - 10/m, \beta]) \le 2(9/10)^{k-m} + 4\varepsilon. \]

□ 

Remark that the statement of Theorem 5.1 would not hold if we replaced the interval $[\alpha, \beta]$ by any Borel set $S$. Indeed, since $N_k$ is an integer, the probability that its mantissa belongs to the set of irrational numbers is zero. In other words, the convergence of the distribution of $\mathcal{M}(N_k)$ to Benford's law is only a weak convergence. However, if we denote by $\tilde{X}_k$ the largest point of the Poisson process which is smaller than $N_k$, then $\tilde{X}_k \in\, ]N_k - 1, N_k]$, and the distribution of $\mathcal{M}(\tilde{X}_k)$ converges to Benford's law in total variation norm.

5.2. Local URI-density is stronger than natural density. In view of Theorem 5.1, it is natural to introduce the local URI-density of $A \subset \mathbb{Z}_+$ as the limit, when $k \to \infty$, of $P(N_k \in A)$ (whenever the limit exists). The purpose of this section is to prove that local URI-density is stronger than natural density:

Theorem 5.2. If $A \subset \mathbb{Z}_+$ possesses a natural density, then the local URI-density of $A$ exists and coincides with its natural density.

In fact, local URI-density turns out to be strictly stronger than natural density, since Theorem 5.1 proves the existence of sets $A$ without natural densities but for which the local URI-density exists.

The proof of Theorem 5.2 is based on the following lemma.

Lemma 5.3. Let $(P_k)_{k\ge1}$ be a sequence of probability distributions on $\mathbb{Z}_+$ satisfying: For any $k \ge 1$, there exists $n_k$ such that

• The map $n \mapsto P_k(n)$ is non-decreasing on $\{1, \dots, n_k\}$ and non-increasing on $\{n_k, n_k + 1, \dots\}$;

• For any integer $m \ge 1$,
\[ \lim_{k\to\infty} \frac{P_k(m n_k)}{P_k(n_k)} = 1. \]
Then, if $A \subset \mathbb{Z}_+$ possesses a natural density $\alpha$, $\lim_{k\to\infty} P_k(A) = \alpha$.




Figure 1. Profile of $P_k$.



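Proof. Let us fix $\varepsilon > 0$. Let $\theta \in\, ]0, 1[$, close enough to 1 so that $(1 - \theta)/\theta < \varepsilon$. Let $m \in \mathbb{Z}_+$ be such that $m > 1/\varepsilon$ and such that, for any $n \ge m$,
\[ \frac{1}{n}\sum_{j=1}^{n} 1_A(j) \in\, ]\alpha - \varepsilon, \alpha + \varepsilon[. \]
We choose $k$ large enough such that
\[ \frac{P_k(m n_k)}{P_k(n_k)} > \theta. \tag{18} \]
By a Fubini argument, we can write $P_k(A)$ as
\[ P_k(A) = \int_0^{P_k(n_k)} \bigl|\{n \in A : P_k(n) > t\}\bigr|\, dt. \tag{19} \]
We split the integral into two terms
\[ I_1 := \int_0^{\theta P_k(n_k)} \bigl|\{n \in A : P_k(n) > t\}\bigr|\, dt \quad\text{and}\quad I_2 := \int_{\theta P_k(n_k)}^{P_k(n_k)} \bigl|\{n \in A : P_k(n) > t\}\bigr|\, dt. \]
Observe that
\[ \theta P_k(n_k)\,\bigl|\{n \in \mathbb{Z}_+ : P_k(n) > \theta P_k(n_k)\}\bigr| \le P_k\bigl(\{n \in \mathbb{Z}_+ : P_k(n) > \theta P_k(n_k)\}\bigr) \le 1. \]
Therefore
\[ I_2 \le (1 - \theta) P_k(n_k)\,\bigl|\{n \in \mathbb{Z}_+ : P_k(n) > \theta P_k(n_k)\}\bigr| \le \frac{1 - \theta}{\theta} < \varepsilon. \]
Let us turn to the estimation of $I_1$. By the hypothesis on the variations of $P_k(n)$, for any $0 < t < P_k(n_k)$, there exist $n^k_{\min}(t) \le n_k \le n^k_{\max}(t)$ such that
\[ \{n : P_k(n) > t\} = \{n^k_{\min}(t), \dots, n^k_{\max}(t)\}. \]
(See Figure 5.2.) We can rewrite $I_1$ as
\[ I_1 = \int_0^{\theta P_k(n_k)} \bigl(n^k_{\max}(t) - n^k_{\min}(t) + 1\bigr)\,\varphi(t)\, dt, \]
where
\[ \varphi(t) := \frac{1}{n^k_{\max}(t) - n^k_{\min}(t) + 1}\sum_{n = n^k_{\min}(t)}^{n^k_{\max}(t)} 1_A(n). \]
We prove that, for $0 < t < \theta P_k(n_k)$, $\varphi(t)$ is close to the natural density of $A$: By (18), for any $0 < t < \theta P_k(n_k)$, we have $n^k_{\max}(t) \ge m n_k$. Thus
\[ 1 \le \frac{n^k_{\max}(t)}{n^k_{\max}(t) - n^k_{\min}(t) + 1} \le \frac{m}{m - 1} \le \frac{1}{1 - \varepsilon}, \]
and
\[ \frac{1}{n^k_{\max}(t)}\sum_{n=1}^{n^k_{\min}(t) - 1} 1_A(n) \le \frac{n_k}{n^k_{\max}(t)} \le \frac{1}{m} < \varepsilon. \]
Since $n^k_{\max}(t) \ge m$,
\[ \frac{1}{n^k_{\max}(t)}\sum_{n=1}^{n^k_{\max}(t)} 1_A(n) \in\, ]\alpha - \varepsilon, \alpha + \varepsilon[. \]
It follows that for $0 < t < \theta P_k(n_k)$, $\alpha - 2\varepsilon \le \varphi(t) \le (\alpha + \varepsilon)/(1 - \varepsilon)$. Hence we get the following estimation:
\[ (\alpha - 2\varepsilon)\, I_3 \le I_1 \le \frac{\alpha + \varepsilon}{1 - \varepsilon}\, I_3, \]
where
\[ I_3 := \int_0^{\theta P_k(n_k)} \bigl(n^k_{\max}(t) - n^k_{\min}(t) + 1\bigr)\, dt. \]
Using (19) with $A = \mathbb{Z}_+$, we get
\[ \int_0^{P_k(n_k)} \bigl(n^k_{\max}(t) - n^k_{\min}(t) + 1\bigr)\, dt = P_k(\mathbb{Z}_+) = 1, \]
and by the same argument as for the estimation of $I_2$, we have
\[ \int_{\theta P_k(n_k)}^{P_k(n_k)} \bigl(n^k_{\max}(t) - n^k_{\min}(t) + 1\bigr)\, dt \le \frac{1 - \theta}{\theta} < \varepsilon, \]
hence $1 - \varepsilon \le I_3 \le 1$. □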
Observe that the second condition in the lemma is crucial. Indeed, the sequence of binomial distributions of parameter $p \in\, ]0, 1[$ defined by
\[ B_k(n) := \binom{k}{n} p^n (1 - p)^{k - n} \quad (0 \le n \le k) \]
satisfies the first assumption of the lemma. However there exists a set $A$ possessing a natural density, but for which $B_k(A)$ fails to converge to this natural density as $k \to \infty$ (see [2], Theorem 3 page 25).

Proof of Theorem 5.2. Defining $P_k(n) := P(N_k = n)$, we have to check the hypotheses of Lemma 5.3. We start by establishing an induction formula for $P_k(n)$. Recall that $P(N_k = n) = 0$ if $n < k$, and that $P(N_1 = n) = 1_{n=1}$. For $2 \le k \le n$, we decompose
\[ P(N_k = n) = P(N_k = n, N_{k-1} = n - 1) + P(N_k = n, N_{k-1} \le n - 2). \]
The first term in the RHS is $(1/n) P(N_{k-1} = n - 1)$. The second term can be written as
\[ P(N_k = n, N_{k-1} \le n - 2) = \frac{1}{n}\Bigl(1 - \frac{1}{n-1}\Bigr) P(|E_{n-2}| = k - 1) = \frac{n-2}{n}\cdot\frac{1}{n-1}\, P(|E_{n-2}| = k - 1) = \frac{n-2}{n}\, P(N_k = n - 1). \]
This yields, for any $2 \le k \le n$,
\[ P_k(n) = \frac{n-2}{n}\, P_k(n - 1) + \frac{1}{n}\, P_{k-1}(n - 1). \tag{20} \]
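
The induction formula (20) gives a direct way to tabulate the law of $N_k$ in exact arithmetic. The sketch below is only an illustration; the function names and the table sizes are choices made here. It also checks, on a finite range, the unimodal profile required by the first hypothesis of Lemma 5.3.

```python
from fractions import Fraction


def law_of_kth_element(k_max: int, n_max: int) -> list:
    """Table P with P[k][n] = P(N_k = n) for 1 <= k <= k_max, 1 <= n <= n_max,
    filled with the induction formula (20); row 0 and column 0 are unused."""
    P = [[Fraction(0)] * (n_max + 1) for _ in range(k_max + 1)]
    P[1][1] = Fraction(1)  # N_1 = 1 almost surely
    for n in range(2, n_max + 1):
        for k in range(2, min(k_max, n) + 1):
            P[k][n] = Fraction(n - 2, n) * P[k][n - 1] + Fraction(1, n) * P[k - 1][n - 1]
    return P


def is_unimodal(values: list) -> bool:
    """True if values are non-decreasing up to some index, then non-increasing."""
    i = 1
    while i < len(values) and values[i] >= values[i - 1]:
        i += 1
    while i < len(values) and values[i] <= values[i - 1]:
        i += 1
    return i == len(values)


P = law_of_kth_element(k_max=6, n_max=300)
print(P[2][2], P[2][3], P[3][3])                      # 1/2 1/6 1/6
print([is_unimodal(P[k][1:]) for k in range(2, 7)])   # all True
```

For $k = 2$ the table reproduces the closed form $P(N_2 = n) = 1/(n(n-1))$, which follows from $P(2 \notin E, \dots, n-1 \notin E, n \in E)$.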

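For $2 \le k \le n - 1$, dividing by $P_k(n - 1)$ we obtain
\[ \frac{P_k(n)}{P_k(n-1)} = 1 + \frac{1}{n}\bigl(f_n(k) - 2\bigr), \tag{21} \]
where
\[ f_n(k) := \frac{P_{k-1}(n - 1)}{P_k(n - 1)}. \]
We prove by induction on $n \ge 4$ that $k \in \{2, \dots, n - 1\} \mapsto f_n(k)$ is a non-decreasing function. Observe that $f_4(2) = 0$ and $f_4(3) = 1$, hence $f_4$ is non-decreasing. Assume that $f_{n-1}$ is non-decreasing for some $n \ge 5$. Using (20), we get for $3 \le k \le n - 1$
\[ f_n(k) = \frac{\frac{n-3}{n-1}\, P_{k-1}(n - 2) + \frac{1}{n-1}\, P_{k-2}(n - 2)}{\frac{n-3}{n-1}\, P_k(n - 2) + \frac{1}{n-1}\, P_{k-1}(n - 2)}. \]
If $3 \le k \le n - 2$, we get
\[ f_n(k) = \frac{n - 3 + f_{n-1}(k - 1)}{\dfrac{n-3}{f_{n-1}(k)} + 1}. \tag{22} \]
Hence, by induction, $f_n$ is non-decreasing on $\{2, \dots, n - 2\}$. Moreover, for $k = n - 1$, since $P_{n-1}(n - 2) = 0$,
\[ f_n(n - 1) = n - 3 + f_{n-1}(n - 2) \ge n - 3 + f_{n-1}(n - 3) \ge f_n(n - 2). \]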
Observe also that $f_n(2) = 0$ and $f_n(n - 1) = (n - 3)(n - 2)/2$ for all $n \ge 4$. Hence, for all $n \ge 5$, there exists an integer $k_n$ such that $f_n(k) \le 2$ for $2 \le k \le k_n$ and $f_n(k) > 2$ for $k > k_n$. Since $f_{n-1}(k - 1) \le f_{n-1}(k)$ for any $3 \le k \le n - 2$, we get by (22) that $f_n(k) \le f_{n-1}(k)$, which proves that $n \mapsto k_n$ is non-decreasing.

For any fixed $k \ge 3$, let $n_k$ be the smallest integer $n$ such that $k_n \ge k$. By (21), $n \mapsto P_k(n)$ is non-decreasing up to $n_k$ and non-increasing after $n_k$. Note that $n_k$ exists, otherwise $n \mapsto P_k(n)$ would be non-decreasing, which is obviously impossible. Therefore, the first hypothesis of Lemma 5.3 is satisfied.

For all $k \ge 3$, observe that $n_k$ is characterized by the following:
\[ f_{n_k}(k) \le 2, \quad\text{and}\quad f_{n_k - 1}(k) > 2. \tag{23} \]

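To check that $(P_k)$ satisfies the second hypothesis, we need precise estimations of $f_n(k)$. We start by establishing a formula for $P_k(n)$. Observe that for all $1 \le j < n$, $P(N_k = n \mid N_{k-1} = j)$ is equal to
\[ P(j + 1 \notin E, \dots, n - 1 \notin E, n \in E) = \frac{j}{j+1}\cdots\frac{n-2}{n-1}\cdot\frac{1}{n} = \frac{j}{n(n-1)}. \]
Hence, by conditioning, $P_k(n) = P(N_k = n)$ can be rewritten as
\[ \sum_{2 \le j_2 < \dots < j_{k-1} \le n-1} P(N_k = n \mid N_{k-1} = j_{k-1})\, P(N_{k-1} = j_{k-1} \mid N_{k-2} = j_{k-2}) \cdots P(N_3 = j_3 \mid N_2 = j_2)\, P(N_2 = j_2), \]
which yields
\[ P_k(n) = \sum_{2 \le j_2 < \dots < j_{k-1} \le n-1} \frac{j_{k-1}}{n(n-1)}\cdot\frac{j_{k-2}}{j_{k-1}(j_{k-1} - 1)}\cdots\frac{j_2}{j_3(j_3 - 1)}\cdot\frac{1}{j_2(j_2 - 1)} = \frac{1}{n(n-1)}\sum_{2 \le j_2 < \dots < j_{k-1} \le n-1} \frac{1}{j_{k-1} - 1}\cdots\frac{1}{j_3 - 1}\cdot\frac{1}{j_2 - 1}. \]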

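We use this formula to estimate the denominator in the definition of $f_n(k)$:
\[ P_k(n - 1) = \frac{1}{(n-1)(n-2)}\sum_{1 \le j_2 < \dots < j_{k-1} \le n-3} \frac{1}{j_2 \cdots j_{k-1}} = \frac{1}{(k-2)(n-1)(n-2)}\sum_{1 \le j_2 < \dots < j_{k-2} \le n-3} \frac{1}{j_2 j_3 \cdots j_{k-2}}\, C(j_2, \dots, j_{k-2}), \]
where
\[ C(j_2, \dots, j_{k-2}) := \sum_{\substack{1 \le j \le n-3 \\ j \notin \{j_2, \dots, j_{k-2}\}}} \frac{1}{j}. \]
Observe that
\[ \sum_{k-2 \le j \le n-3} \frac{1}{j} \le C(j_2, \dots, j_{k-2}) \le \sum_{1 \le j \le n-3} \frac{1}{j}, \]
which gives the following estimation
\[ \frac{k - 2}{\sum_{1 \le j \le n-3} \frac{1}{j}} \le f_n(k) = \frac{P_{k-1}(n - 1)}{P_k(n - 1)} \le \frac{k - 2}{\sum_{k-2 \le j \le n-3} \frac{1}{j}}. \tag{24} \]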
Let $m > 1$. Applying this estimation to $f_{n_k-1}$ and $f_{mn_k}$, we get

$$0 \le f_{n_k-1}(k) - f_{mn_k}(k) \le \frac{k-2}{\sum_{k-2 \le j \le n_k-4} \frac{1}{j}} - \frac{k-2}{\sum_{1 \le j \le mn_k-3} \frac{1}{j}}.$$

The RHS of the above inequality can be written as a product $AB$, where

$$A := \frac{k-2}{\sum_{k-2 \le j \le n_k-4} \frac{1}{j}} \quad\text{and}\quad
B := \frac{\sum_{1 \le j \le mn_k-3} \frac{1}{j} - \sum_{k-2 \le j \le n_k-4} \frac{1}{j}}{\sum_{1 \le j \le mn_k-3} \frac{1}{j}}.$$

By (24) and (23), we get

$$\frac{k-2}{\sum_{1 \le j \le n_k-3} \frac{1}{j}} < 2, \tag{25}$$

which ensures that $A$ is bounded (say, by 4). Moreover, an easy computation shows
that $B$ is bounded above by a constant multiple of $\ln k \big/ \sum_{1 \le j \le n_k-3} \frac{1}{j}$, which goes
to 0 as $k \to \infty$ by (25). Recalling that $f_{n_k-1}(k) \ge 2$ by (23), the above estimations
prove the following property: For any $\varepsilon > 0$, for $k$ large enough, $f_n(k) \ge 2 - \varepsilon$ for
each $n_k \le n \le mn_k$. For such $k$, we get by (21)

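$$\frac{P_k(mn_k)}{P_k(n_k)} = \prod_{n=n_k+1}^{mn_k} \Bigl(1 + \frac{1}{n}\bigl(f_n(k) - 2\bigr)\Bigr) \ge \Bigl(1 - \frac{\varepsilon}{n_k}\Bigr)^{(m-1)n_k},$$

and the right-hand side is at least $1 - (m-1)\varepsilon$ by Bernoulli's inequality.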



On the other hand, we know that $P_k(n_k) \ge P_k(mn_k)$, which proves that

$$\lim_{k \to \infty} \frac{P_k(mn_k)}{P_k(n_k)} = 1.$$


□ 



6. Open problems and discussion 

6.1. Connection with other densities. We conjecture that the existence of local
URI-density implies the existence of URI-density (and, in this case, that both
coincide).

It is not clear either whether local URI-density is equivalent to the $H^\infty$-density
used by Flehinger. However we can provide the following interpretation of the
$H^\infty$-density in terms of our URI-set. Recall that in the proof of Theorem 3.3, we
ordered the elements of $E_n = E \cap \{1, \ldots, n\}$ backwards:

$$E_n = \bigl\{Y_1^{(n)} > Y_2^{(n)} > \cdots > Y_{|E_n|}^{(n)} = 1\bigr\}.$$

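Proposition 6.1 (Stochastic interpretation of $H^\infty$-density). For $A \subset \mathbb{Z}_+$, $A$ has
$H^\infty$-density $\alpha$ if and only if

$$\lim_{k \to \infty} \liminf_{n \to \infty} \mathbb{P}\bigl(Y_k^{(n)} \in A\bigr) = \lim_{k \to \infty} \limsup_{n \to \infty} \mathbb{P}\bigl(Y_k^{(n)} \in A\bigr) = \alpha.$$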

Proof. For each $n \ge 1$, we introduce a non-increasing sequence of random integers
$(\tilde Y_k^{(n)})_{k \ge 1}$, with the following distribution:

• $\tilde Y_1^{(n)}$ is uniformly distributed in $\{1, \ldots, n\}$,

• conditionally to $\tilde Y_1^{(n)}, \ldots, \tilde Y_j^{(n)}$, the random variable $\tilde Y_{j+1}^{(n)}$ is uniformly
distributed in $\{1, \ldots, \tilde Y_j^{(n)}\}$.
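For $A \subset \mathbb{Z}_+$, we write $\mathbb{P}\bigl(\tilde Y_k^{(n)} \in A\bigr)$ as

$$\sum_{1 \le y_k \le y_{k-1} \le \cdots \le y_1 \le n} \mathbb{P}\bigl(\tilde Y_1^{(n)} = y_1, \ldots, \tilde Y_k^{(n)} = y_k\bigr)\, \mathbf{1}_A(y_k),$$

which yields

$$\mathbb{P}\bigl(\tilde Y_k^{(n)} \in A\bigr) = \frac{1}{n} \sum_{y_1=1}^{n} \frac{1}{y_1} \sum_{y_2=1}^{y_1} \cdots \frac{1}{y_{k-1}} \sum_{y_k=1}^{y_{k-1}} \mathbf{1}_A(y_k).$$

We recognize the iterated average used in the definition of the $H^\infty$-density (see
Section 3.3). Now, we observe that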



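$$\mathbb{P}\bigl(Y_k^{(n)} \in A\bigr) = \mathbb{P}\bigl(\tilde Y_k^{(n)} \in A \bigm| D_k^{(n)}\bigr), \quad\text{where } D_k^{(n)} := \bigl\{\tilde Y_1^{(n)} > \cdots > \tilde Y_k^{(n)}\bigr\}.$$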



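It remains to prove that, for any fixed $k$, $\mathbb{P}(D_k^{(n)}) \to 1$ as $n \to \infty$. Fix $\varepsilon > 0$ and
choose $\delta > 0$ such that $(1 - 3\delta)^k > 1 - \varepsilon$. Observe that whenever $n > 1/\delta$, the
proportion of integers $i \in \{1, \ldots, n\}$ such that $\delta \le i/n < 1 - \delta$ is larger than $1 - 3\delta$.
Now, if $n > 1/\delta^k$, we have

$$\mathbb{P}\Bigl(\frac{\tilde Y_1^{(n)}}{n} \in [\delta, 1-\delta[,\ \frac{\tilde Y_2^{(n)}}{\tilde Y_1^{(n)}} \in [\delta, 1-\delta[,\ \ldots,\ \frac{\tilde Y_k^{(n)}}{\tilde Y_{k-1}^{(n)}} \in [\delta, 1-\delta[\Bigr) \ge (1 - 3\delta)^k > 1 - \varepsilon.$$

Hence, $\mathbb{P}(D_k^{(n)}) \ge 1 - \varepsilon$. □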
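
The two probabilities appearing in this proof can be compared numerically. The Python sketch below is an illustration under stated assumptions: $1 \in E$ almost surely, so that $\max(E \cap \{1, \ldots, m\})$ is uniform on $\{1, \ldots, m\}$, and $A$ is taken to be the set of integers with leading digit 1 purely as an example. The function names are ours. It computes by dynamic programming the $k$-fold iterated average and the probability that the $k$-th element of $E_n$, counted backwards, lies in $A$.

```python
from itertools import accumulate

def leading_digit_is_one(j):
    return str(j)[0] == "1"

def iterated_average(indicator, n, k):
    """k-fold iterated Cesaro average at n: h_0 = 1_A, h_i(m) = (1/m) sum_{y<=m} h_{i-1}(y)."""
    h = [0.0] + [float(indicator(j)) for j in range(1, n + 1)]
    for _ in range(k):
        prefix = list(accumulate(h))            # prefix[m] = h[0] + ... + h[m]
        h = [0.0] + [prefix[m] / m for m in range(1, n + 1)]
    return h[n]

def backward_element(indicator, n, k):
    """P(the k-th largest element of E ∩ {1..n} exists and lies in A).
    Uses: the largest element y is uniform on {1..n}, and the elements
    below it form a copy of E ∩ {1..y-1}."""
    g = [0.0] + [float(indicator(j)) for j in range(1, n + 1)]
    prefix = list(accumulate(g))
    g = [0.0] + [prefix[m] / m for m in range(1, n + 1)]       # k = 1
    for _ in range(k - 1):
        prefix = list(accumulate(g))
        # g_i(m) = (1/m) * sum_{y=2..m} g_{i-1}(y-1) = prefix[m-1] / m
        g = [0.0] + [prefix[m - 1] / m for m in range(1, n + 1)]
    return g[n]

n, k = 20000, 3
print(iterated_average(leading_digit_is_one, n, k))
print(backward_element(leading_digit_is_one, n, k))
```

For fixed $k$ the two printed values get closer as $n$ grows, in line with the fact that the sequence $(\tilde Y_j^{(n)})$ is strictly decreasing with probability tending to 1.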

6.2. Conditional URI-density. Let $P$ be a subset of $\mathbb{Z}_+$ with $\sum_{p \in P} 1/p = \infty$,
so that the cardinality of $P \cap E$ is almost surely infinite. We have two ways to
define the URI-density of $A$ conditioned on $P$. First, by averaging over $P \cap E$, and
consider (whenever it exists) the almost-sure limit of

$$\frac{|A \cap P \cap E_n|}{|P \cap E_n|} \quad\text{as } n \to \infty.$$

Second, by numbering the elements of $P = \{p_1 < p_2 < \cdots < p_n < \cdots\}$ and
averaging over the random subset of $P$

$$\{p_{N_1} < p_{N_2} < \cdots < p_{N_k} < \cdots\},$$

that is by considering (whenever it exists) the almost-sure limit of

$$\frac{1}{n} \sum_{k=1}^{n} \mathbf{1}_A(p_{N_k}).$$



Question: are these two definitions equivalent? 

6.3. Asymptotic independence of successive elements in E. Another question
concerning the URI-set $E$ is the following: consider $A, B \subset \mathbb{Z}_+$ and assume
that for both $A$ and $B$ we can define the density $d(A)$ and $d(B)$ (these could be the
natural densities, the URI-densities, the $H^\infty$-densities or maybe some other notions
of densities). Under which condition do we have

$$\frac{1}{n} \sum_{k=1}^{n} \mathbf{1}_A(N_k)\, \mathbf{1}_B(N_{k+1}) \xrightarrow[n \to \infty]{} d(A)\, d(B)\ ?$$

We conjecture that it is true when both $d(A)$ and $d(B)$ are natural densities.
However this is certainly not true for all $A$ and $B$ with URI-densities: As a
counterexample, consider the set $A$ of integers with leading digit 1 and the set $B$ of
integers with leading digit 9.

But what happens if for example $d(B)$ is the natural density whereas $d(A)$ is
only the URI-density?
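
The counterexample with leading digits 1 and 9 can be illustrated by simulation. The Python sketch below relies on an approximation that we state explicitly: since $\mathbb{P}(N_{k+1} > m \mid N_k = j) = j/m$, the ratio $N_{k+1}/N_k$ is close to $1/U$ with $U$ uniform on $(0,1]$ when $N_k$ is large, so only the fractional part of $\log_{10} N_k$ is tracked. The function name and parameters are ours.

```python
import math
import random

def simulate_pairs(steps, seed=0):
    """Frequencies of: leading digit of N_k is 1; leading digit of N_{k+1} is 9; both.
    Approximation: log10(N_{k+1}) = log10(N_k) - log10(U), U uniform on (0, 1]."""
    rng = random.Random(seed)
    log2, log9 = math.log10(2), math.log10(9)
    x = rng.random()                       # fractional part of log10(N_k)
    hits_a = hits_b = hits_ab = 0
    for _ in range(steps):
        u = 1.0 - rng.random()             # uniform on (0, 1], avoids log10(0)
        x_next = (x - math.log10(u)) % 1.0
        a = x < log2                       # N_k has leading digit 1
        b = x_next >= log9                 # N_{k+1} has leading digit 9
        hits_a += a
        hits_b += b
        hits_ab += a and b
        x = x_next
    return hits_a / steps, hits_b / steps, hits_ab / steps

freq_a, freq_b, freq_ab = simulate_pairs(200_000, seed=1)
print(freq_a, freq_b, freq_a * freq_b, freq_ab)
```

In this approximate model the joint frequency comes out well below the product of the two marginal frequencies: an element with leading digit 1 is rarely followed by one with leading digit 9.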

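6.4. Iterated URI-density. Diaconis defined in [2] the iterated log-density: For
a subset $A$ of $\mathbb{Z}_+$, set

$$L(A, n, 1) := \frac{1}{\ln n} \sum_{j=1}^{n} \frac{1}{j}\, \mathbf{1}_A(j),$$

and inductively for all $\ell \ge 2$:

$$L(A, n, \ell) := \frac{1}{\ln n} \sum_{j=1}^{n} \frac{1}{j}\, L(A, j, \ell-1).$$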

The set $A$ has $\ell$-th log-density $\alpha$ if the limit of the above exists as $n \to \infty$ and is
equal to $\alpha$. In fact, this notion does not yield a new density, since Diaconis proved
that $A$ has an $\ell$-th log-density if and only if $A$ has a log-density (and then, of course,
both coincide). Then he proposed to define the $L^\infty$-density, which extends the
log-density in much the same way as $H^\infty$-density extends natural density: Consider
$\lim_{\ell \to \infty} \liminf_{n \to \infty} L(A, n, \ell)$ and $\lim_{\ell \to \infty} \limsup_{n \to \infty} L(A, n, \ell)$. If the two limits
are equal, call their common value the $L^\infty$-density of $A$. As shown in [4],
$L^\infty$-density is strictly stronger than log-density.
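
The first two iterated log-averages are straightforward to compute. The Python sketch below is an illustration with assumptions of our own: the inner sum defining $L(A, n, 2)$ is started at $j = 2$, because $L(A, 1, 1)$ would involve a division by $\ln 1 = 0$, and $A$ is taken to be the set of integers with leading digit 1. The function name is ours.

```python
import math

def iterated_log_averages(indicator, n):
    """Return (L(A, n, 1), L(A, n, 2)) for n >= 2.
    Assumption: the sum defining L(A, n, 2) runs over 2 <= j <= n."""
    s1 = 0.0    # running sum of 1_A(j) / j
    s2 = 0.0    # running sum of L(A, j, 1) / j for j >= 2
    for j in range(1, n + 1):
        if indicator(j):
            s1 += 1.0 / j
        if j >= 2:
            s2 += (s1 / math.log(j)) / j
    return s1 / math.log(n), s2 / math.log(n)

first, second = iterated_log_averages(lambda j: str(j)[0] == "1", 10**5)
print(first, second, math.log10(2))
```

Both averages converge slowly, so at $n = 10^5$ they are only rough approximations of $\log_{10} 2$.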

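It is natural in this context to study iterations of the URI-density, which can
be defined as follows: Let $(E^{(\ell)})_{\ell \ge 1}$ be a sequence of independent URI-sets, and
denote the random elements of $E^{(\ell)}$ by $N_1^{(\ell)} = 1 < N_2^{(\ell)} < \cdots < N_k^{(\ell)} < \cdots$. We
say that $A \subset \mathbb{Z}_+$ has URI-density of order 2 equal to $\alpha$ if

$$\frac{1}{n} \sum_{k=1}^{n} \mathbf{1}_A\Bigl(N^{(1)}_{N^{(2)}_k}\Bigr) \xrightarrow[n \to \infty]{\text{a.s.}} \alpha.$$

We define the URI-density of order $\ell$ in the same way, averaging along the
subsequence obtained by nesting the indices $\ell$ times, $N^{(1)}_{N^{(2)}_{\cdots_{N^{(\ell)}_k}}}$.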

We can also introduce the infinite iteration of the URI-method, considering
almost-sure limsup and liminf in the above expressions, and see if they converge to
the same limit as $\ell \to \infty$.

Although we have shown that URI-density and log-density coincide, it is not
obvious if there are connections between iterated URI-density and iterated
log-density. Can URI-densities of finite order $\ell$ be strictly stronger than URI-density?
Can we compare the infinite iteration of both methods?